Overview

Dataset statistics

Number of variables: 27
Number of observations: 1000
Missing cells: 3199
Missing cells (%): 11.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 251.0 KiB
Average record size in memory: 257.1 B
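
The percentage and per-record figures in this table follow from the other rows. A minimal sketch of that arithmetic, using only the numbers reported above (the variable names are illustrative and not part of pandas-profiling):

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 27
n_observations = 1000
missing_cells = 3199
total_size_kib = 251.0  # already rounded to one decimal in the report

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Approximate bytes per record. This can differ slightly from the reported
# 257.1 B because the total size above is itself a rounded value.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: about {avg_record_bytes:.1f} B")
```

The small gap between the recomputed record size and the reported one is a rounding effect, not a discrepancy in the data.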

Variable types

Numeric: 4
Categorical: 23

Alerts

Telephone has constant value "Registered under the applicant's name" [Constant]
loan_application_id has a high cardinality: 1000 distinct values [High cardinality]
Months_loan_taken_for is highly correlated with Principal_loan_amount [High correlation]
Principal_loan_amount is highly correlated with Months_loan_taken_for [High correlation]
applicant_id is highly correlated with Number_of_dependents and 5 other fields [High correlation]
Primary_applicant_age_in_years is highly correlated with Number_of_dependents and 4 other fields [High correlation]
Number_of_dependents is highly correlated with applicant_id and 5 other fields [High correlation]
Years_at_current_residence is highly correlated with Foreign_worker and 3 other fields [High correlation]
Foreign_worker is highly correlated with applicant_id and 7 other fields [High correlation]
Months_loan_taken_for is highly correlated with Has_coapplicant and 1 other field [High correlation]
Principal_loan_amount is highly correlated with EMI_rate_in_percentage_of_disposable_income and 3 other fields [High correlation]
EMI_rate_in_percentage_of_disposable_income is highly correlated with Principal_loan_amount and 4 other fields [High correlation]
Has_coapplicant is highly correlated with applicant_id and 10 other fields [High correlation]
Has_guarantor is highly correlated with applicant_id and 10 other fields [High correlation]
Number_of_existing_loans_at_this_bank is highly correlated with applicant_id and 5 other fields [High correlation]
high_risk_applicant is highly correlated with applicant_id and 9 other fields [High correlation]
Other_EMI_plans is highly correlated with Telephone [High correlation]
EMI_rate_in_percentage_of_disposable_income is highly correlated with Telephone [High correlation]
Telephone is highly correlated with Other_EMI_plans and 20 other fields [High correlation]
Savings_account_balance is highly correlated with Telephone [High correlation]
Number_of_dependents is highly correlated with Telephone [High correlation]
Property is highly correlated with Telephone [High correlation]
Has_been_employed_for_at_most is highly correlated with Telephone and 1 other field [High correlation]
Balance_in_existing_bank_account_(upper_limit_of_bucket) is highly correlated with Telephone and 1 other field [High correlation]
Loan_history is highly correlated with Telephone [High correlation]
Purpose is highly correlated with Telephone [High correlation]
Has_guarantor is highly correlated with Telephone [High correlation]
high_risk_applicant is highly correlated with Telephone [High correlation]
Has_coapplicant is highly correlated with Telephone [High correlation]
Balance_in_existing_bank_account_(lower_limit_of_bucket) is highly correlated with Telephone and 1 other field [High correlation]
Foreign_worker is highly correlated with Telephone [High correlation]
Housing is highly correlated with Telephone [High correlation]
Marital_status is highly correlated with Telephone and 1 other field [High correlation]
Has_been_employed_for_at_least is highly correlated with Telephone and 1 other field [High correlation]
Gender is highly correlated with Telephone and 1 other field [High correlation]
Years_at_current_residence is highly correlated with Telephone [High correlation]
Number_of_existing_loans_at_this_bank is highly correlated with Telephone [High correlation]
Employment_status is highly correlated with Telephone [High correlation]
Gender is highly correlated with Marital_status [High correlation]
Marital_status is highly correlated with Gender [High correlation]
Years_at_current_residence is highly correlated with Has_been_employed_for_at_least [High correlation]
Employment_status is highly correlated with Has_been_employed_for_at_most [High correlation]
Has_been_employed_for_at_least is highly correlated with Years_at_current_residence and 1 other field [High correlation]
Has_been_employed_for_at_most is highly correlated with Employment_status and 1 other field [High correlation]
Has_been_employed_for_at_least has 62 (6.2%) missing values [Missing]
Has_been_employed_for_at_most has 253 (25.3%) missing values [Missing]
Telephone has 596 (59.6%) missing values [Missing]
Savings_account_balance has 183 (18.3%) missing values [Missing]
Balance_in_existing_bank_account_(lower_limit_of_bucket) has 668 (66.8%) missing values [Missing]
Balance_in_existing_bank_account_(upper_limit_of_bucket) has 457 (45.7%) missing values [Missing]
Purpose has 12 (1.2%) missing values [Missing]
Property has 154 (15.4%) missing values [Missing]
Other_EMI_plans has 814 (81.4%) missing values [Missing]
loan_application_id is uniformly distributed [Uniform]
applicant_id has unique values [Unique]
loan_application_id has unique values [Unique]
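
The per-variable counts in the Missing alerts can be cross-checked against the dataset-level "Missing cells" figure. The sketch below does that with plain Python; the 50% review threshold is a hypothetical choice for illustration, not something the report prescribes.

```python
# Per-variable missing counts, copied from the "Missing" alerts above.
missing_by_variable = {
    "Has_been_employed_for_at_least": 62,
    "Has_been_employed_for_at_most": 253,
    "Telephone": 596,
    "Savings_account_balance": 183,
    "Balance_in_existing_bank_account_(lower_limit_of_bucket)": 668,
    "Balance_in_existing_bank_account_(upper_limit_of_bucket)": 457,
    "Purpose": 12,
    "Property": 154,
    "Other_EMI_plans": 814,
}
n_observations = 1000

# Should equal the "Missing cells" value in the dataset statistics.
total_missing = sum(missing_by_variable.values())

# Hypothetical review threshold: variables that are more than half empty.
mostly_missing = sorted(
    name
    for name, count in missing_by_variable.items()
    if count / n_observations > 0.5
)

print(total_missing)
print(mostly_missing)
```

The nine alerts account for all 3199 missing cells, so no other variable in the dataset has missing values.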

Reproduction

Analysis started: 2022-09-16 07:18:20.714216
Analysis finished: 2022-09-16 07:18:55.259404
Duration: 34.55 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json
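
The reported duration is simply the difference between the two timestamps. A short sketch with the standard library, using the timestamps exactly as printed above:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2022-09-16 07:18:20.714216")
finished = datetime.fromisoformat("2022-09-16 07:18:55.259404")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```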

Variables

applicant_id
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 1000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1514763.121
Minimum: 1105364
Maximum: 1903505
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 47.9 KiB

Quantile statistics

Minimum: 1105364
5-th percentile: 1149540.05
Q1: 1321398
Median: 1529114.5
Q3: 1707751.75
95-th percentile: 1861018.85
Maximum: 1903505
Range: 798141
Interquartile range (IQR): 386353.75

Descriptive statistics

Standard deviation: 228676.3733
Coefficient of variation (CV): 0.1509651048
Kurtosis: -1.167938801
Mean: 1514763.121
Median Absolute Deviation (MAD): 192824.5
Skewness: -0.08540877521
Sum: 1514763121
Variance: 5.229288373 × 10^10
Monotonicity: Not monotonic
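
Several of the statistics above are derived from others in the same tables. The following sketch recomputes the range, IQR, coefficient of variation and variance from the reported values for applicant_id; it illustrates the definitions and is not a description of how pandas-profiling computes them internally.

```python
# Values copied from the applicant_id statistics.
minimum, maximum = 1105364, 1903505
q1, q3 = 1321398, 1707751.75
mean = 1514763.121
std = 228676.3733

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance is the squared standard deviation

print(value_range)
print(iqr)
print(f"{cv:.6f}")
print(f"{variance:.4e}")
```

Since applicant_id is an identifier with all values unique, these statistics describe how the IDs were allocated rather than any property of the applicants.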
Histogram with fixed size bins (bins=50)
Common values

Value               Count  Frequency (%)
1469590             1      0.1%
1554792             1      0.1%
1439887             1      0.1%
1237671             1      0.1%
1352279             1      0.1%
1409197             1      0.1%
1448066             1      0.1%
1337177             1      0.1%
1199661             1      0.1%
1483329             1      0.1%
Other values (990)  990    99.0%

Minimum 10 values

Value    Count  Frequency (%)
1105364  1      0.1%
1106411  1      0.1%
1106688  1      0.1%
1106801  1      0.1%
1107910  1      0.1%
1109612  1      0.1%
1109861  1      0.1%
1112826  1      0.1%
1113231  1      0.1%
1113703  1      0.1%

Maximum 10 values

Value    Count  Frequency (%)
1903505  1      0.1%
1902944  1      0.1%
1902571  1      0.1%
1902547  1      0.1%
1902302  1      0.1%
1901818  1      0.1%
1901178  1      0.1%
1900089  1      0.1%
1894721  1      0.1%
1893730  1      0.1%

Primary_applicant_age_in_years
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 53
Distinct (%): 5.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 35.546
Minimum: 19
Maximum: 75
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 47.9 KiB

Quantile statistics

Minimum: 19
5-th percentile: 22
Q1: 27
Median: 33
Q3: 42
95-th percentile: 60
Maximum: 75
Range: 56
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 11.37546857
Coefficient of variation (CV): 0.3200210593
Kurtosis: 0.5957795671
Mean: 35.546
Median Absolute Deviation (MAD): 7
Skewness: 1.020739269
Sum: 35546
Variance: 129.4012853
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value              Count  Frequency (%)
27                 51     5.1%
26                 50     5.0%
23                 48     4.8%
24                 44     4.4%
28                 43     4.3%
25                 41     4.1%
30                 40     4.0%
35                 40     4.0%
36                 39     3.9%
31                 38     3.8%
Other values (43)  566    56.6%

Minimum 10 values

Value  Count  Frequency (%)
19     2      0.2%
20     14     1.4%
21     14     1.4%
22     27     2.7%
23     48     4.8%
24     44     4.4%
25     41     4.1%
26     50     5.0%
27     51     5.1%
28     43     4.3%

Maximum 10 values

Value  Count  Frequency (%)
75     2      0.2%
74     4      0.4%
70     1      0.1%
68     3      0.3%
67     3      0.3%
66     5      0.5%
65     5      0.5%
64     5      0.5%
63     8      0.8%
62     2      0.2%

Gender
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 47.9 KiB
male: 690
female: 310

Length

Max length: 6
Median length: 4
Mean length: 4.62
Min length: 4

Characters and Unicode

Total characters: 4620
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: male
2nd row: female
3rd row: male
4th row: male
5th row: male

Common Values

Value   Count  Frequency (%)
male    690    69.0%
female  310    31.0%

Length

Histogram of lengths of the category

Category Frequency Plot

Value   Count  Frequency (%)
male    690    69.0%
female  310    31.0%

Most occurring characters

Value  Count  Frequency (%)
e      1310   28.4%
m      1000   21.6%
a      1000   21.6%
l      1000   21.6%
f      310    6.7%
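
The character counts above can be reproduced from the two category counts alone. A minimal sketch with `collections.Counter`, weighting each character by how often its category occurs (variable names are illustrative):

```python
from collections import Counter

# Category counts copied from the Gender "Common Values" table.
value_counts = {"male": 690, "female": 310}

char_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        # Each occurrence of the category contributes all of its characters.
        char_counts[ch] += count

total_chars = sum(char_counts.values())
mean_length = total_chars / sum(value_counts.values())

for ch, n in char_counts.most_common():
    print(ch, n, f"{100 * n / total_chars:.1f}%")
print("mean length:", mean_length)
```

The letter "e" leads because it appears once in "male" and twice in "female", which is why its count exceeds the number of rows.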

Most occurring categories

Value             Count  Frequency (%)
Lowercase Letter  4620   100.0%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
e      1310   28.4%
m      1000   21.6%
a      1000   21.6%
l      1000   21.6%
f      310    6.7%

Most occurring scripts

Value  Count  Frequency (%)
Latin  4620   100.0%

Most frequent character per script

Latin
Value  Count  Frequency (%)
e      1310   28.4%
m      1000   21.6%
a      1000   21.6%
l      1000   21.6%
f      310    6.7%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  4620   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
e      1310   28.4%
m      1000   21.6%
a      1000   21.6%
l      1000   21.6%
f      310    6.7%

Marital_status
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 4
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 47.9 KiB
single: 548
divorced/separated/married: 310
married/widowed: 92
divorced/separated: 50

Length

Max length: 26
Median length: 6
Mean length: 13.628
Min length: 6

Characters and Unicode

Total characters: 13628
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: single
2nd row: divorced/separated/married
3rd row: single
4th row: single
5th row: single

Common Values

Value                       Count  Frequency (%)
single                      548    54.8%
divorced/separated/married  310    31.0%
married/widowed             92     9.2%
divorced/separated          50     5.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
single548
54.8%
divorced/separated/married310
31.0%
married/widowed92
 
9.2%
divorced/separated50
 
5.0%

Most occurring characters

ValueCountFrequency (%)
e2122
15.6%
d1666
12.2%
r1524
11.2%
i1402
10.3%
a1122
8.2%
s908
 
6.7%
/762
 
5.6%
n548
 
4.0%
g548
 
4.0%
l548
 
4.0%
Other values (7)2478
18.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter12866
94.4%
Other Punctuation762
 
5.6%
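
The category names in this table correspond to Unicode general categories. As a rough illustration, Python's `unicodedata.category` returns the two-letter codes for them ("Ll" for Lowercase Letter, "Po" for Other Punctuation), and tallying those codes over the Marital_status values reproduces the counts above. This is a sketch of the idea, not the profiler's own implementation.

```python
import unicodedata
from collections import Counter

# Category counts copied from the Marital_status "Common Values" table.
value_counts = {
    "single": 548,
    "divorced/separated/married": 310,
    "married/widowed": 92,
    "divorced/separated": 50,
}

category_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        # unicodedata.category returns a two-letter code such as "Ll" or "Po".
        category_counts[unicodedata.category(ch)] += count

print(dict(category_counts))
```

Every "Po" character here is the "/" separator, which is why Other Punctuation and the "/" row in the character table share the same count.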

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e2122
16.5%
d1666
12.9%
r1524
11.8%
i1402
10.9%
a1122
8.7%
s908
7.1%
n548
 
4.3%
g548
 
4.3%
l548
 
4.3%
o452
 
3.5%
Other values (6)2026
15.7%
Other Punctuation
ValueCountFrequency (%)
/762
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin12866
94.4%
Common762
 
5.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
e2122
16.5%
d1666
12.9%
r1524
11.8%
i1402
10.9%
a1122
8.7%
s908
7.1%
n548
 
4.3%
g548
 
4.3%
l548
 
4.3%
o452
 
3.5%
Other values (6)2026
15.7%
Common
ValueCountFrequency (%)
/762
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII13628
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e2122
15.6%
d1666
12.2%
r1524
11.2%
i1402
10.3%
a1122
8.2%
s908
 
6.7%
/762
 
5.6%
n548
 
4.0%
g548
 
4.0%
l548
 
4.0%
Other values (7)2478
18.2%

Number_of_dependents
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 47.9 KiB
1: 845
2: 155

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 2
4th row: 2
5th row: 2

Common Values

Value  Count  Frequency (%)
1      845    84.5%
2      155    15.5%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
1845
84.5%
2155
 
15.5%

Most occurring characters

ValueCountFrequency (%)
1845
84.5%
2155
 
15.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number1000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1845
84.5%
2155
 
15.5%

Most occurring scripts

ValueCountFrequency (%)
Common1000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1845
84.5%
2155
 
15.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII1000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1845
84.5%
2155
 
15.5%

Housing
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 47.9 KiB
own: 713
rent: 179
for free: 108

Length

Max length: 8
Median length: 3
Mean length: 3.719
Min length: 3

Characters and Unicode

Total characters: 3719
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: own
2nd row: own
3rd row: own
4th row: for free
5th row: for free

Common Values

Value     Count  Frequency (%)
own       713    71.3%
rent      179    17.9%
for free  108    10.8%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
own    713    64.4%
rent   179    16.2%
for    108    9.7%
free   108    9.7%
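
The percentages in this table differ from those under Common Values because the counts here are per word rather than per row: "for free" contributes two words. The sketch below assumes simple whitespace tokenisation, which matches the figures shown but is an assumption about how the report derives them.

```python
from collections import Counter

# Category counts copied from the Housing "Common Values" table.
value_counts = {"own": 713, "rent": 179, "for free": 108}

word_counts = Counter()
for value, count in value_counts.items():
    # Assumption: categories are split on whitespace into words.
    for word in value.split():
        word_counts[word] += count

total_words = sum(word_counts.values())
shares = {w: round(100 * n / total_words, 1) for w, n in word_counts.items()}

print(total_words)
print(shares)
```

With 1108 words in total, "own" falls from 71.3% of rows to 64.4% of words.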

Most occurring characters

Value    Count  Frequency (%)
n        892    24.0%
o        821    22.1%
w        713    19.2%
r        395    10.6%
e        395    10.6%
f        216    5.8%
t        179    4.8%
(space)  108    2.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter3611
97.1%
Space Separator108
 
2.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n892
24.7%
o821
22.7%
w713
19.7%
r395
10.9%
e395
10.9%
f216
 
6.0%
t179
 
5.0%
Space Separator
ValueCountFrequency (%)
108
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin3611
97.1%
Common108
 
2.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
n892
24.7%
o821
22.7%
w713
19.7%
r395
10.9%
e395
10.9%
f216
 
6.0%
t179
 
5.0%
Common
ValueCountFrequency (%)
108
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII3719
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n892
24.0%
o821
22.1%
w713
19.2%
r395
10.6%
e395
10.6%
f216
 
5.8%
t179
 
4.8%
108
 
2.9%

Years_at_current_residence
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 4
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 47.9 KiB
4: 413
2: 308
3: 149
1: 130

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1000
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4
2nd row: 2
3rd row: 3
4th row: 4
5th row: 4

Common Values

Value  Count  Frequency (%)
4      413    41.3%
2      308    30.8%
3      149    14.9%
1      130    13.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
4413
41.3%
2308
30.8%
3149
 
14.9%
1130
 
13.0%

Most occurring characters

ValueCountFrequency (%)
4413
41.3%
2308
30.8%
3149
 
14.9%
1130
 
13.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number1000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4413
41.3%
2308
30.8%
3149
 
14.9%
1130
 
13.0%

Most occurring scripts

ValueCountFrequency (%)
Common1000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4413
41.3%
2308
30.8%
3149
 
14.9%
1130
 
13.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4413
41.3%
2308
30.8%
3149
 
14.9%
1130
 
13.0%

Employment_status
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 4
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 47.9 KiB
skilled employee / official: 630
unskilled - resident: 200
management / self-employed / highly qualified employee / officer: 148
unemployed / unskilled - non-resident: 22

Length

Max length: 64
Median length: 27
Mean length: 31.296
Min length: 20

Characters and Unicode

Total characters: 31296
Distinct characters: 23
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: skilled employee / official
2nd row: skilled employee / official
3rd row: unskilled - resident
4th row: skilled employee / official
5th row: skilled employee / official

Common Values

Value                                                             Count  Frequency (%)
skilled employee / official                                       630    63.0%
unskilled - resident                                              200    20.0%
management / self-employed / highly qualified employee / officer  148    14.8%
unemployed / unskilled - non-resident                             22     2.2%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
1318
28.9%
employee778
17.1%
skilled630
13.8%
official630
13.8%
unskilled222
 
4.9%
resident200
 
4.4%
management148
 
3.2%
self-employed148
 
3.2%
highly148
 
3.2%
qualified148
 
3.2%
Other values (3)192
 
4.2%

Most occurring characters

ValueCountFrequency (%)
e4710
15.0%
l3726
11.9%
3562
11.4%
i2926
 
9.3%
f1852
 
5.9%
o1748
 
5.6%
d1392
 
4.4%
m1244
 
4.0%
s1222
 
3.9%
y1096
 
3.5%
Other values (13)7818
25.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter26246
83.9%
Space Separator3562
 
11.4%
Other Punctuation1096
 
3.5%
Dash Punctuation392
 
1.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e4710
17.9%
l3726
14.2%
i2926
11.1%
f1852
 
7.1%
o1748
 
6.7%
d1392
 
5.3%
m1244
 
4.7%
s1222
 
4.7%
y1096
 
4.2%
a1074
 
4.1%
Other values (10)5256
20.0%
Space Separator
ValueCountFrequency (%)
3562
100.0%
Other Punctuation
ValueCountFrequency (%)
/1096
100.0%
Dash Punctuation
ValueCountFrequency (%)
-392
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin26246
83.9%
Common5050
 
16.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e4710
17.9%
l3726
14.2%
i2926
11.1%
f1852
 
7.1%
o1748
 
6.7%
d1392
 
5.3%
m1244
 
4.7%
s1222
 
4.7%
y1096
 
4.2%
a1074
 
4.1%
Other values (10)5256
20.0%
Common
ValueCountFrequency (%)
3562
70.5%
/1096
 
21.7%
-392
 
7.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII31296
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e4710
15.0%
l3726
11.9%
3562
11.4%
i2926
 
9.3%
f1852
 
5.9%
o1748
 
5.6%
d1392
 
4.4%
m1244
 
4.0%
s1222
 
3.9%
y1096
 
3.5%
Other values (13)7818
25.0%

Has_been_employed_for_at_least
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 4
Distinct (%): 0.4%
Missing: 62
Missing (%): 6.2%
Memory size: 47.9 KiB
1 year: 339
7 years: 253
4 years: 174
0 year: 172

Length

Max length: 7
Median length: 6
Mean length: 6.455223881
Min length: 6

Characters and Unicode

Total characters: 6055
Distinct characters: 10
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 7 years
2nd row: 1 year
3rd row: 4 years
4th row: 4 years
5th row: 1 year

Common Values

Value      Count  Frequency (%)
1 year     339    33.9%
7 years    253    25.3%
4 years    174    17.4%
0 year     172    17.2%
(Missing)  62     6.2%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
year511
27.2%
years427
22.8%
1339
18.1%
7253
13.5%
4174
 
9.3%
0172
 
9.2%

Most occurring characters

ValueCountFrequency (%)
938
15.5%
y938
15.5%
e938
15.5%
a938
15.5%
r938
15.5%
s427
7.1%
1339
 
5.6%
7253
 
4.2%
4174
 
2.9%
0172
 
2.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter4179
69.0%
Space Separator938
 
15.5%
Decimal Number938
 
15.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
y938
22.4%
e938
22.4%
a938
22.4%
r938
22.4%
s427
10.2%
Decimal Number
ValueCountFrequency (%)
1339
36.1%
7253
27.0%
4174
18.6%
0172
18.3%
Space Separator
ValueCountFrequency (%)
938
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin4179
69.0%
Common1876
31.0%

Most frequent character per script

Common
ValueCountFrequency (%)
938
50.0%
1339
 
18.1%
7253
 
13.5%
4174
 
9.3%
0172
 
9.2%
Latin
ValueCountFrequency (%)
y938
22.4%
e938
22.4%
a938
22.4%
r938
22.4%
s427
10.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII6055
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
938
15.5%
y938
15.5%
e938
15.5%
a938
15.5%
r938
15.5%
s427
7.1%
1339
 
5.6%
7253
 
4.2%
4174
 
2.9%
0172
 
2.8%

Has_been_employed_for_at_most
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 4
Distinct (%): 0.5%
Missing: 253
Missing (%): 25.3%
Memory size: 47.9 KiB
4 years: 339
7 years: 174
1 year: 172
0 year: 62

Length

Max length: 7
Median length: 7
Mean length: 6.686746988
Min length: 6

Characters and Unicode

Total characters: 4995
Distinct characters: 10
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4 years
2nd row: 7 years
3rd row: 7 years
4th row: 4 years
5th row: 4 years

Common Values

Value      Count  Frequency (%)
4 years    339    33.9%
7 years    174    17.4%
1 year     172    17.2%
0 year     62     6.2%
(Missing)  253    25.3%
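
Both employment-duration variables store their bounds as text such as "7 years" or "0 year", which is why the report treats them as categorical. If numeric bounds are wanted, a small parser along the following lines could be used. The function name `parse_years` is hypothetical, and keeping missing values as `None` is one possible choice rather than a recommendation from the report.

```python
import re
from typing import Optional


def parse_years(label: Optional[str]) -> Optional[int]:
    """Convert a label like '7 years' or '0 year' to an int; None stays None."""
    if label is None:
        return None
    match = re.fullmatch(r"(\d+) years?", label.strip())
    if match is None:
        raise ValueError(f"unexpected label: {label!r}")
    return int(match.group(1))


# Labels observed in the two Has_been_employed_for_* variables.
for label in ["0 year", "1 year", "4 years", "7 years", None]:
    print(repr(label), "->", parse_years(label))
```

The counts in the two variables line up in a way that suggests open-ended buckets: the 253 missing upper bounds equal the 253 rows with a lower bound of "7 years", and the 62 missing lower bounds equal the 62 rows with an upper bound of "0 year". The report does not confirm this pairing row by row, so it is worth checking against the raw data before imputing anything.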

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
years513
34.3%
4339
22.7%
year234
15.7%
7174
 
11.6%
1172
 
11.5%
062
 
4.1%

Most occurring characters

ValueCountFrequency (%)
747
15.0%
y747
15.0%
e747
15.0%
a747
15.0%
r747
15.0%
s513
10.3%
4339
6.8%
7174
 
3.5%
1172
 
3.4%
062
 
1.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter3501
70.1%
Space Separator747
 
15.0%
Decimal Number747
 
15.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
y747
21.3%
e747
21.3%
a747
21.3%
r747
21.3%
s513
14.7%
Decimal Number
ValueCountFrequency (%)
4339
45.4%
7174
23.3%
1172
23.0%
062
 
8.3%
Space Separator
ValueCountFrequency (%)
747
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin3501
70.1%
Common1494
29.9%

Most frequent character per script

Common
ValueCountFrequency (%)
747
50.0%
4339
22.7%
7174
 
11.6%
1172
 
11.5%
062
 
4.1%
Latin
ValueCountFrequency (%)
y747
21.3%
e747
21.3%
a747
21.3%
r747
21.3%
s513
14.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII4995
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
747
15.0%
y747
15.0%
e747
15.0%
a747
15.0%
r747
15.0%
s513
10.3%
4339
6.8%
7174
 
3.5%
1172
 
3.4%
062
 
1.2%

Telephone
Categorical

CONSTANT
HIGH CORRELATION
MISSING
REJECTED

Distinct: 1
Distinct (%): 0.2%
Missing: 596
Missing (%): 59.6%
Memory size: 47.9 KiB
Registered under the applicant's name: 404

Length

Max length: 37
Median length: 37
Mean length: 37
Min length: 37

Characters and Unicode

Total characters: 14948
Distinct characters: 18
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Registered under the applicant's name
2nd row: Registered under the applicant's name
3rd row: Registered under the applicant's name
4th row: Registered under the applicant's name
5th row: Registered under the applicant's name

Common Values

Value                                  Count  Frequency (%)
Registered under the applicant's name  404    40.4%
(Missing)                              596    59.6%
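
Telephone is flagged as constant and rejected because its only observed value is a single string, yet 59.6% of rows are missing. One way to keep the information is to recode the column as a presence indicator. The sketch below assumes that a missing entry means no telephone is registered under the applicant's name, which the report itself does not state; `telephone_indicator` is a hypothetical helper.

```python
from typing import List, Optional

REGISTERED = "Registered under the applicant's name"


def telephone_indicator(values: List[Optional[str]]) -> List[int]:
    """Map the single observed value to 1 and missing entries to 0."""
    return [1 if v == REGISTERED else 0 for v in values]


# Small made-up sample; None stands for a missing cell.
sample = [REGISTERED, None, REGISTERED, None, None]
print(telephone_indicator(sample))
```

If the missing entries instead mean "unknown", this recoding would be misleading and the column is better dropped, as the REJECTED flag suggests.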

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
registered404
20.0%
under404
20.0%
the404
20.0%
applicant's404
20.0%
name404
20.0%

Most occurring characters

ValueCountFrequency (%)
e2424
16.2%
1616
10.8%
t1212
 
8.1%
n1212
 
8.1%
a1212
 
8.1%
i808
 
5.4%
s808
 
5.4%
r808
 
5.4%
d808
 
5.4%
p808
 
5.4%
Other values (8)3232
21.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter12524
83.8%
Space Separator1616
 
10.8%
Uppercase Letter404
 
2.7%
Other Punctuation404
 
2.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e2424
19.4%
t1212
9.7%
n1212
9.7%
a1212
9.7%
i808
 
6.5%
s808
 
6.5%
r808
 
6.5%
d808
 
6.5%
p808
 
6.5%
l404
 
3.2%
Other values (5)2020
16.1%
Space Separator
ValueCountFrequency (%)
1616
100.0%
Uppercase Letter
ValueCountFrequency (%)
R404
100.0%
Other Punctuation
ValueCountFrequency (%)
'404
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin12928
86.5%
Common2020
 
13.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e2424
18.8%
t1212
9.4%
n1212
9.4%
a1212
9.4%
i808
 
6.2%
s808
 
6.2%
r808
 
6.2%
d808
 
6.2%
p808
 
6.2%
R404
 
3.1%
Other values (6)2424
18.8%
Common
ValueCountFrequency (%)
1616
80.0%
'404
 
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII14948
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e2424
16.2%
1616
10.8%
t1212
 
8.1%
n1212
 
8.1%
a1212
 
8.1%
i808
 
5.4%
s808
 
5.4%
r808
 
5.4%
d808
 
5.4%
p808
 
5.4%
Other values (8)3232
21.6%

Foreign_worker
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 47.9 KiB
1: 963
0: 37

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      963    96.3%
0      37     3.7%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
1963
96.3%
037
 
3.7%

Most occurring characters

ValueCountFrequency (%)
1963
96.3%
037
 
3.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number1000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1963
96.3%
037
 
3.7%

Most occurring scripts

ValueCountFrequency (%)
Common1000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1963
96.3%
037
 
3.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII1000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1963
96.3%
037
 
3.7%

Savings_account_balance
Categorical

HIGH CORRELATION
MISSING

Distinct: 4
Distinct (%): 0.5%
Missing: 183
Missing (%): 18.3%
Memory size: 47.9 KiB
Low: 603
Medium: 103
High: 63
Very high: 48

Length

Max length: 9
Median length: 3
Mean length: 3.807833537
Min length: 3

Characters and Unicode

Total characters: 3111
Distinct characters: 16
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Low
2nd row: Low
3rd row: Low
4th row: Low
5th row: High

Common Values

Value      Count  Frequency (%)
Low        603    60.3%
Medium     103    10.3%
High       63     6.3%
Very high  48     4.8%
(Missing)  183    18.3%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
low603
69.7%
high111
 
12.8%
medium103
 
11.9%
very48
 
5.5%

Most occurring characters

ValueCountFrequency (%)
L603
19.4%
o603
19.4%
w603
19.4%
i214
 
6.9%
h159
 
5.1%
e151
 
4.9%
g111
 
3.6%
M103
 
3.3%
d103
 
3.3%
u103
 
3.3%
Other values (6)358
11.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter2246
72.2%
Uppercase Letter817
 
26.3%
Space Separator48
 
1.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o603
26.8%
w603
26.8%
i214
 
9.5%
h159
 
7.1%
e151
 
6.7%
g111
 
4.9%
d103
 
4.6%
u103
 
4.6%
m103
 
4.6%
r48
 
2.1%
Uppercase Letter
ValueCountFrequency (%)
L603
73.8%
M103
 
12.6%
H63
 
7.7%
V48
 
5.9%
Space Separator
ValueCountFrequency (%)
48
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin3063
98.5%
Common48
 
1.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
L603
19.7%
o603
19.7%
w603
19.7%
i214
 
7.0%
h159
 
5.2%
e151
 
4.9%
g111
 
3.6%
M103
 
3.4%
d103
 
3.4%
u103
 
3.4%
Other values (5)310
10.1%
Common
ValueCountFrequency (%)
48
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII3111
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
L603
19.4%
o603
19.4%
w603
19.4%
i214
 
6.9%
h159
 
5.1%
e151
 
4.9%
g111
 
3.6%
M103
 
3.3%
d103
 
3.3%
u103
 
3.3%
Other values (6)358
11.5%
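
Balance_in_existing_bank_account_(lower_limit_of_bucket)
Categorical

HIGH CORRELATION
MISSING

Distinct: 2
Distinct (%): 0.6%
Missing: 668
Missing (%): 66.8%
Memory size: 47.9 KiB
0: 269
2 lac: 63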

Length

Max length: 5
Median length: 1
Mean length: 1.759036145
Min length: 1

Characters and Unicode

Total characters: 584
Distinct characters: 6
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value      Count  Frequency (%)
0          269    26.9%
2 lac      63     6.3%
(Missing)  668    66.8%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0269
68.1%
263
 
15.9%
lac63
 
15.9%

Most occurring characters

ValueCountFrequency (%)
0269
46.1%
263
 
10.8%
63
 
10.8%
l63
 
10.8%
a63
 
10.8%
c63
 
10.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number332
56.8%
Lowercase Letter189
32.4%
Space Separator63
 
10.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l63
33.3%
a63
33.3%
c63
33.3%
Decimal Number
ValueCountFrequency (%)
0269
81.0%
263
 
19.0%
Space Separator
ValueCountFrequency (%)
63
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common395
67.6%
Latin189
32.4%

Most frequent character per script

Common
ValueCountFrequency (%)
0269
68.1%
263
 
15.9%
63
 
15.9%
Latin
ValueCountFrequency (%)
l63
33.3%
a63
33.3%
c63
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII584
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0269
46.1%
263
 
10.8%
63
 
10.8%
l63
 
10.8%
a63
 
10.8%
c63
 
10.8%
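
Balance_in_existing_bank_account_(upper_limit_of_bucket)
Categorical

HIGH CORRELATION
MISSING

Distinct: 2
Distinct (%): 0.4%
Missing: 457
Missing (%): 45.7%
Memory size: 47.9 KiB
0: 274
2 lac: 269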

Length

Max length: 5
Median length: 1
Mean length: 2.981583794
Min length: 1

Characters and Unicode

Total characters: 1619
Distinct characters: 6
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 2 lac
3rd row: 0
4th row: 0
5th row: 2 lac

Common Values

Value      Count  Frequency (%)
0          274    27.4%
2 lac      269    26.9%
(Missing)  457    45.7%
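
The bucket limits are stored as the strings "0" and "2 lac". To use them numerically they need to be parsed. The sketch below assumes that "lac" is the Indian unit lakh (100,000); the report does not define the unit, and `parse_bucket_limit` is a hypothetical helper.

```python
from typing import Optional

LAKH = 100_000  # assumption: "lac" means lakh, i.e. one hundred thousand


def parse_bucket_limit(label: Optional[str]) -> Optional[int]:
    """Convert '0' or '2 lac' to an integer amount; None stays None."""
    if label is None:
        return None
    parts = label.split()
    if len(parts) == 1:
        return int(parts[0])
    if len(parts) == 2 and parts[1].lower() in ("lac", "lakh"):
        return int(parts[0]) * LAKH
    raise ValueError(f"unexpected bucket label: {label!r}")


for label in ["0", "2 lac", None]:
    print(repr(label), "->", parse_bucket_limit(label))
```

Missing limits are left as `None` here, since a missing lower or upper limit may describe an open-ended bucket rather than an unknown balance.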

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0274
33.7%
2269
33.1%
lac269
33.1%

Most occurring characters

ValueCountFrequency (%)
0274
16.9%
2269
16.6%
269
16.6%
l269
16.6%
a269
16.6%
c269
16.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter807
49.8%
Decimal Number543
33.5%
Space Separator269
 
16.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l269
33.3%
a269
33.3%
c269
33.3%
Decimal Number
ValueCountFrequency (%)
0274
50.5%
2269
49.5%
Space Separator
ValueCountFrequency (%)
269
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common812
50.2%
Latin807
49.8%

Most frequent character per script

Common
ValueCountFrequency (%)
0274
33.7%
2269
33.1%
269
33.1%
Latin
ValueCountFrequency (%)
l269
33.3%
a269
33.3%
c269
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII1619
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0274
16.9%
2269
16.6%
269
16.6%
l269
16.6%
a269
16.6%
c269
16.6%

loan_application_id
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE

Distinct1000
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Memory size47.9 KiB
d68d975e-edad-11ea-8761-1d6f9c1ff461
 
1
d68f0dd2-edad-11ea-8785-076f1a6b0e1d
 
1
d68f06d4-edad-11ea-8933-08c4e91e1cae
 
1
d68f0760-edad-11ea-9800-185713046169
 
1
d68f07ec-edad-11ea-b6db-3496b8dbd5c8
 
1
Other values (995)
995 

Length

Max length36
Median length36
Mean length36
Min length36

Characters and Unicode

Total characters36000
Distinct characters17
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1000 ?
Unique (%)100.0%

Sample

1st rowd68d975e-edad-11ea-8761-1d6f9c1ff461
2nd rowd68d989e-edad-11ea-b1d5-2bcf65006448
3rd rowd68d995c-edad-11ea-814a-1b6716782575
4th rowd68d99fc-edad-11ea-8841-17e8848060ae
5th rowd68d9a92-edad-11ea-9f3d-1f8682db006a

Common Values

Value | Count | Frequency (%)
d68d975e-edad-11ea-8761-1d6f9c1ff461 | 1 | 0.1%
d68f0dd2-edad-11ea-8785-076f1a6b0e1d | 1 | 0.1%
d68f06d4-edad-11ea-8933-08c4e91e1cae | 1 | 0.1%
d68f0760-edad-11ea-9800-185713046169 | 1 | 0.1%
d68f07ec-edad-11ea-b6db-3496b8dbd5c8 | 1 | 0.1%
d68f086e-edad-11ea-a24f-12b786b3a993 | 1 | 0.1%
d68f08fa-edad-11ea-981a-51ccea0b8718 | 1 | 0.1%
d68f0986-edad-11ea-802f-20d1fc935ad1 | 1 | 0.1%
d68f0a08-edad-11ea-9f98-4c6666986d43 | 1 | 0.1%
d68f0a94-edad-11ea-a0cb-4ef9f8e6c478 | 1 | 0.1%
Other values (990) | 990 | 99.0%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
d68d975e-edad-11ea-8761-1d6f9c1ff461 | 1 | 0.1%
d68d9f74-edad-11ea-bd59-102afb4e8303 | 1 | 0.1%
d68da88e-edad-11ea-911c-45363b9e71a7 | 1 | 0.1%
d68da802-edad-11ea-8fb1-430f7bd15180 | 1 | 0.1%
d68d995c-edad-11ea-814a-1b6716782575 | 1 | 0.1%
d68d99fc-edad-11ea-8841-17e8848060ae | 1 | 0.1%
d68d9a92-edad-11ea-9f3d-1f8682db006a | 1 | 0.1%
d68d9b1e-edad-11ea-8b43-2b6a0308d487 | 1 | 0.1%
d68d9bb4-edad-11ea-bb16-0490ef14f12e | 1 | 0.1%
d68d9c40-edad-11ea-b46c-5067ccf3672a | 1 | 0.1%
Other values (990) | 990 | 99.0%

Most occurring characters

ValueCountFrequency (%)
d4273
11.9%
-4000
11.1%
e3575
9.9%
a3492
9.7%
13231
 
9.0%
82452
 
6.8%
62149
 
6.0%
f1406
 
3.9%
21380
 
3.8%
41373
 
3.8%
Other values (7)8669
24.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number16755
46.5%
Lowercase Letter15245
42.3%
Dash Punctuation4000
 
11.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
13231
19.3%
82452
14.6%
62149
12.8%
21380
8.2%
41373
8.2%
01358
8.1%
91331
7.9%
31268
 
7.6%
51161
 
6.9%
71052
 
6.3%
Lowercase Letter
ValueCountFrequency (%)
d4273
28.0%
e3575
23.5%
a3492
22.9%
f1406
 
9.2%
b1334
 
8.8%
c1165
 
7.6%
Dash Punctuation
ValueCountFrequency (%)
-4000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common20755
57.7%
Latin15245
42.3%

Most frequent character per script

Common
ValueCountFrequency (%)
-4000
19.3%
13231
15.6%
82452
11.8%
62149
10.4%
21380
 
6.6%
41373
 
6.6%
01358
 
6.5%
91331
 
6.4%
31268
 
6.1%
51161
 
5.6%
Latin
ValueCountFrequency (%)
d4273
28.0%
e3575
23.5%
a3492
22.9%
f1406
 
9.2%
b1334
 
8.8%
c1165
 
7.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII36000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
d4273
11.9%
-4000
11.1%
e3575
9.9%
a3492
9.7%
13231
 
9.0%
82452
 
6.8%
62149
 
6.0%
f1406
 
3.9%
21380
 
3.8%
41373
 
3.8%
Other values (7)8669
24.1%

Months_loan_taken_for
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 33
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20.903
Minimum: 4
Maximum: 72
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 47.9 KiB

Quantile statistics

Minimum: 4
5th percentile: 6
Q1: 12
Median: 18
Q3: 24
95th percentile: 48
Maximum: 72
Range: 68
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 12.05881445
Coefficient of variation (CV): 0.5768939603
Kurtosis: 0.9197813601
Mean: 20.903
Median Absolute Deviation (MAD): 6
Skewness: 1.094184172
Sum: 20903
Variance: 145.415006
Monotonicity: Not monotonic
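
Several of the figures above are derived from one another, so they can be cross-checked by hand. The short sketch below recomputes the coefficient of variation, variance, range, IQR and mean from the reported values of Months_loan_taken_for; the variable names are illustrative only.

```python
# Values copied from the Months_loan_taken_for summary in this report.
mean = 20.903
std = 12.05881445
minimum, maximum = 4, 72
q1, q3 = 12, 24
n_obs, total = 1000, 20903  # 1000 observations, none missing

cv = std / mean                 # coefficient of variation
variance = std ** 2             # variance is the squared standard deviation
value_range = maximum - minimum
iqr = q3 - q1                   # interquartile range
mean_from_sum = total / n_obs   # mean recovered from the reported sum

print(round(cv, 6), round(variance, 4), value_range, iqr, mean_from_sum)
```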
Histogram with fixed size bins (bins=33)
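Most frequent values

Value | Count | Frequency (%)
24 | 184 | 18.4%
12 | 179 | 17.9%
18 | 113 | 11.3%
36 | 83 | 8.3%
6 | 75 | 7.5%
15 | 64 | 6.4%
9 | 49 | 4.9%
48 | 48 | 4.8%
30 | 40 | 4.0%
21 | 30 | 3.0%
Other values (23) | 135 | 13.5%

Smallest values

Value | Count | Frequency (%)
4 | 6 | 0.6%
5 | 1 | 0.1%
6 | 75 | 7.5%
7 | 5 | 0.5%
8 | 7 | 0.7%
9 | 49 | 4.9%
10 | 28 | 2.8%
11 | 9 | 0.9%
12 | 179 | 17.9%
13 | 4 | 0.4%

Largest values

Value | Count | Frequency (%)
72 | 1 | 0.1%
60 | 13 | 1.3%
54 | 2 | 0.2%
48 | 48 | 4.8%
47 | 1 | 0.1%
45 | 5 | 0.5%
42 | 11 | 1.1%
40 | 1 | 0.1%
39 | 5 | 0.5%
36 | 83 | 8.3%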

Purpose
Categorical

HIGH CORRELATION
MISSING

Distinct9
Distinct (%)0.9%
Missing12
Missing (%)1.2%
Memory size47.9 KiB
electronic equipment
280 
new vehicle
234 
FF&E
181 
used vehicle
103 
business
97 
Other values (4)
93 

Length

Max length20
Median length19
Mean length12.15991903
Min length4

Characters and Unicode

Total characters12014
Distinct characters23
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowelectronic equipment
2nd rowelectronic equipment
3rd roweducation
4th rowFF&E
5th rownew vehicle

Common Values

Value | Count | Frequency (%)
electronic equipment | 280 | 28.0%
new vehicle | 234 | 23.4%
FF&E | 181 | 18.1%
used vehicle | 103 | 10.3%
business | 97 | 9.7%
education | 50 | 5.0%
repair costs | 22 | 2.2%
domestic appliances | 12 | 1.2%
career development | 9 | 0.9%
(Missing) | 12 | 1.2%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
vehicle337
20.4%
electronic280
17.0%
equipment280
17.0%
new234
14.2%
ff&e181
11.0%
used103
 
6.2%
business97
 
5.9%
education50
 
3.0%
repair22
 
1.3%
costs22
 
1.3%
Other values (4)42
 
2.5%

Most occurring characters

ValueCountFrequency (%)
e2369
19.7%
i1090
 
9.1%
c1002
 
8.3%
n962
 
8.0%
660
 
5.5%
t653
 
5.4%
l638
 
5.3%
u530
 
4.4%
s462
 
3.8%
o373
 
3.1%
Other values (13)3275
27.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter10630
88.5%
Space Separator660
 
5.5%
Uppercase Letter543
 
4.5%
Other Punctuation181
 
1.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e2369
22.3%
i1090
10.3%
c1002
9.4%
n962
9.0%
t653
 
6.1%
l638
 
6.0%
u530
 
5.0%
s462
 
4.3%
o373
 
3.5%
v346
 
3.3%
Other values (9)2205
20.7%
Uppercase Letter
ValueCountFrequency (%)
F362
66.7%
E181
33.3%
Space Separator
ValueCountFrequency (%)
660
100.0%
Other Punctuation
ValueCountFrequency (%)
&181
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin11173
93.0%
Common841
 
7.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e2369
21.2%
i1090
 
9.8%
c1002
 
9.0%
n962
 
8.6%
t653
 
5.8%
l638
 
5.7%
u530
 
4.7%
s462
 
4.1%
o373
 
3.3%
F362
 
3.2%
Other values (11)2732
24.5%
Common
ValueCountFrequency (%)
660
78.5%
&181
 
21.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII12014
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e2369
19.7%
i1090
 
9.1%
c1002
 
8.3%
n962
 
8.0%
660
 
5.5%
t653
 
5.4%
l638
 
5.3%
u530
 
4.4%
s462
 
3.8%
o373
 
3.1%
Other values (13)3275
27.3%

Principal_loan_amount
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 921
Distinct (%): 92.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3271258
Minimum: 250000
Maximum: 18424000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 47.9 KiB

Quantile statistics

Minimum: 250000
5th percentile: 708950
Q1: 1365500
Median: 2319500
Q3: 3972250
95th percentile: 9162700
Maximum: 18424000
Range: 18174000
Interquartile range (IQR): 2606750

Descriptive statistics

Standard deviation: 2822736.876
Coefficient of variation (CV): 0.8628903241
Kurtosis: 4.292590308
Mean: 3271258
Median Absolute Deviation (MAD): 1097500
Skewness: 1.94962768
Sum: 3271258000
Variance: 7.967843471 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values

Value | Count | Frequency (%)
1478000 | 3 | 0.3%
1262000 | 3 | 0.3%
1258000 | 3 | 0.3%
1275000 | 3 | 0.3%
1393000 | 3 | 0.3%
1442000 | 2 | 0.2%
3590000 | 2 | 0.2%
2578000 | 2 | 0.2%
701000 | 2 | 0.2%
1924000 | 2 | 0.2%
Other values (911) | 975 | 97.5%

Smallest values

Value | Count | Frequency (%)
250000 | 1 | 0.1%
276000 | 1 | 0.1%
338000 | 1 | 0.1%
339000 | 1 | 0.1%
343000 | 1 | 0.1%
362000 | 1 | 0.1%
368000 | 1 | 0.1%
385000 | 1 | 0.1%
392000 | 1 | 0.1%
409000 | 1 | 0.1%

Largest values

Value | Count | Frequency (%)
18424000 | 1 | 0.1%
15945000 | 1 | 0.1%
15857000 | 1 | 0.1%
15672000 | 1 | 0.1%
15653000 | 1 | 0.1%
14896000 | 1 | 0.1%
14782000 | 1 | 0.1%
14555000 | 1 | 0.1%
14421000 | 1 | 0.1%
14318000 | 1 | 0.1%

EMI_rate_in_percentage_of_disposable_income
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct4
Distinct (%)0.4%
Missing0
Missing (%)0.0%
Memory size47.9 KiB
4
476 
2
231 
3
157 
1
136 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters1000
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row4
2nd row2
3rd row2
4th row2
5th row3

Common Values

Value | Count | Frequency (%)
4 | 476 | 47.6%
2 | 231 | 23.1%
3 | 157 | 15.7%
1 | 136 | 13.6%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
4476
47.6%
2231
23.1%
3157
 
15.7%
1136
 
13.6%

Most occurring characters

ValueCountFrequency (%)
4476
47.6%
2231
23.1%
3157
 
15.7%
1136
 
13.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number1000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4476
47.6%
2231
23.1%
3157
 
15.7%
1136
 
13.6%

Most occurring scripts

ValueCountFrequency (%)
Common1000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4476
47.6%
2231
23.1%
3157
 
15.7%
1136
 
13.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII1000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4476
47.6%
2231
23.1%
3157
 
15.7%
1136
 
13.6%

Property
Categorical

HIGH CORRELATION
MISSING

Distinct3
Distinct (%)0.4%
Missing154
Missing (%)15.4%
Memory size47.9 KiB
car or other
332 
real estate
282 
building society savings agreement/life insurance
232 

Length

Max length49
Median length12
Mean length21.81323877
Min length11

Characters and Unicode

Total characters18454
Distinct characters21
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowreal estate
2nd rowreal estate
3rd rowreal estate
4th rowbuilding society savings agreement/life insurance
5th rowbuilding society savings agreement/life insurance

Common Values

Value | Count | Frequency (%)
car or other | 332 | 33.2%
real estate | 282 | 28.2%
building society savings agreement/life insurance | 232 | 23.2%
(Missing) | 154 | 15.4%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
car332
12.2%
or332
12.2%
other332
12.2%
real282
10.4%
estate282
10.4%
building232
8.5%
society232
8.5%
savings232
8.5%
agreement/life232
8.5%
insurance232
8.5%

Most occurring characters

ValueCountFrequency (%)
e2570
13.9%
1874
10.2%
r1742
9.4%
a1592
8.6%
i1392
 
7.5%
t1360
 
7.4%
s1210
 
6.6%
n1160
 
6.3%
o896
 
4.9%
c796
 
4.3%
Other values (11)3862
20.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter16348
88.6%
Space Separator1874
 
10.2%
Other Punctuation232
 
1.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e2570
15.7%
r1742
10.7%
a1592
9.7%
i1392
8.5%
t1360
8.3%
s1210
7.4%
n1160
7.1%
o896
 
5.5%
c796
 
4.9%
l746
 
4.6%
Other values (9)2884
17.6%
Space Separator
ValueCountFrequency (%)
1874
100.0%
Other Punctuation
ValueCountFrequency (%)
/232
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin16348
88.6%
Common2106
 
11.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e2570
15.7%
r1742
10.7%
a1592
9.7%
i1392
8.5%
t1360
8.3%
s1210
7.4%
n1160
7.1%
o896
 
5.5%
c796
 
4.9%
l746
 
4.6%
Other values (9)2884
17.6%
Common
ValueCountFrequency (%)
1874
89.0%
/232
 
11.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII18454
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e2570
13.9%
1874
10.2%
r1742
9.4%
a1592
8.6%
i1392
 
7.5%
t1360
 
7.4%
s1210
 
6.6%
n1160
 
6.3%
o896
 
4.9%
c796
 
4.3%
Other values (11)3862
20.9%

Has_coapplicant
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size47.9 KiB
0
959 
1
 
41

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters1000
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

Value | Count | Frequency (%)
0 | 959 | 95.9%
1 | 41 | 4.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0959
95.9%
141
 
4.1%

Most occurring characters

ValueCountFrequency (%)
0959
95.9%
141
 
4.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number1000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0959
95.9%
141
 
4.1%

Most occurring scripts

ValueCountFrequency (%)
Common1000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0959
95.9%
141
 
4.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII1000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0959
95.9%
141
 
4.1%

Has_guarantor
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size47.9 KiB
0
948 
1
 
52

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters1000
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row1
5th row0

Common Values

Value | Count | Frequency (%)
0 | 948 | 94.8%
1 | 52 | 5.2%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0948
94.8%
152
 
5.2%

Most occurring characters

ValueCountFrequency (%)
0948
94.8%
152
 
5.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number1000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0948
94.8%
152
 
5.2%

Most occurring scripts

ValueCountFrequency (%)
Common1000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0948
94.8%
152
 
5.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII1000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0948
94.8%
152
 
5.2%

Other_EMI_plans
Categorical

HIGH CORRELATION
MISSING

Distinct2
Distinct (%)1.1%
Missing814
Missing (%)81.4%
Memory size47.9 KiB
bank
139 
stores
47 

Length

Max length6
Median length4
Mean length4.505376344
Min length4

Characters and Unicode

Total characters838
Distinct characters9
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowbank
2nd rowbank
3rd rowbank
4th rowstores
5th rowbank

Common Values

Value | Count | Frequency (%)
bank | 139 | 13.9%
stores | 47 | 4.7%
(Missing) | 814 | 81.4%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
bank139
74.7%
stores47
 
25.3%

Most occurring characters

ValueCountFrequency (%)
b139
16.6%
a139
16.6%
n139
16.6%
k139
16.6%
s94
11.2%
t47
 
5.6%
o47
 
5.6%
r47
 
5.6%
e47
 
5.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter838
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
b139
16.6%
a139
16.6%
n139
16.6%
k139
16.6%
s94
11.2%
t47
 
5.6%
o47
 
5.6%
r47
 
5.6%
e47
 
5.6%

Most occurring scripts

ValueCountFrequency (%)
Latin838
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
b139
16.6%
a139
16.6%
n139
16.6%
k139
16.6%
s94
11.2%
t47
 
5.6%
o47
 
5.6%
r47
 
5.6%
e47
 
5.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII838
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
b139
16.6%
a139
16.6%
n139
16.6%
k139
16.6%
s94
11.2%
t47
 
5.6%
o47
 
5.6%
r47
 
5.6%
e47
 
5.6%

Number_of_existing_loans_at_this_bank
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct4
Distinct (%)0.4%
Missing0
Missing (%)0.0%
Memory size47.9 KiB
1
633 
2
333 
3
 
28
4
 
6

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters1000
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row2
2nd row1
3rd row1
4th row1
5th row2

Common Values

Value | Count | Frequency (%)
1 | 633 | 63.3%
2 | 333 | 33.3%
3 | 28 | 2.8%
4 | 6 | 0.6%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
1633
63.3%
2333
33.3%
328
 
2.8%
46
 
0.6%

Most occurring characters

ValueCountFrequency (%)
1633
63.3%
2333
33.3%
328
 
2.8%
46
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number1000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1633
63.3%
2333
33.3%
328
 
2.8%
46
 
0.6%

Most occurring scripts

ValueCountFrequency (%)
Common1000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1633
63.3%
2333
33.3%
328
 
2.8%
46
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII1000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1633
63.3%
2333
33.3%
328
 
2.8%
46
 
0.6%

Loan_history
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)0.5%
Missing0
Missing (%)0.0%
Memory size47.9 KiB
existing loans paid back duly till now
530 
critical/pending loans at other banks
293 
delay in paying off loans in the past
88 
all loans at this bank paid back duly
 
49
no loans taken/all loans paid back duly
 
40

Length

Max length39
Median length38
Mean length37.61
Min length37

Characters and Unicode

Total characters37610
Distinct characters23
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowcritical/pending loans at other banks
2nd rowexisting loans paid back duly till now
3rd rowcritical/pending loans at other banks
4th rowexisting loans paid back duly till now
5th rowdelay in paying off loans in the past

Common Values

Value | Count | Frequency (%)
existing loans paid back duly till now | 530 | 53.0%
critical/pending loans at other banks | 293 | 29.3%
delay in paying off loans in the past | 88 | 8.8%
all loans at this bank paid back duly | 49 | 4.9%
no loans taken/all loans paid back duly | 40 | 4.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
loans1040
15.9%
paid619
9.4%
back619
9.4%
duly619
9.4%
existing530
8.1%
till530
8.1%
now530
8.1%
at342
 
5.2%
critical/pending293
 
4.5%
other293
 
4.5%
Other values (12)1136
17.3%

Most occurring characters

ValueCountFrequency (%)
5551
14.8%
a3648
 
9.7%
i3401
 
9.0%
n3372
 
9.0%
l3278
 
8.7%
t2253
 
6.0%
s2000
 
5.3%
o1991
 
5.3%
d1619
 
4.3%
e1332
 
3.5%
Other values (13)9165
24.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter31726
84.4%
Space Separator5551
 
14.8%
Other Punctuation333
 
0.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a3648
11.5%
i3401
10.7%
n3372
10.6%
l3278
10.3%
t2253
 
7.1%
s2000
 
6.3%
o1991
 
6.3%
d1619
 
5.1%
e1332
 
4.2%
c1205
 
3.8%
Other values (11)7627
24.0%
Space Separator
ValueCountFrequency (%)
5551
100.0%
Other Punctuation
ValueCountFrequency (%)
/333
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin31726
84.4%
Common5884
 
15.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a3648
11.5%
i3401
10.7%
n3372
10.6%
l3278
10.3%
t2253
 
7.1%
s2000
 
6.3%
o1991
 
6.3%
d1619
 
5.1%
e1332
 
4.2%
c1205
 
3.8%
Other values (11)7627
24.0%
Common
ValueCountFrequency (%)
5551
94.3%
/333
 
5.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII37610
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5551
14.8%
a3648
 
9.7%
i3401
 
9.0%
n3372
 
9.0%
l3278
 
8.7%
t2253
 
6.0%
s2000
 
5.3%
o1991
 
5.3%
d1619
 
4.3%
e1332
 
3.5%
Other values (13)9165
24.4%

high_risk_applicant
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size47.9 KiB
0
700 
1
300 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters1000
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row1
3rd row0
4th row0
5th row1

Common Values

Value | Count | Frequency (%)
0 | 700 | 70.0%
1 | 300 | 30.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0700
70.0%
1300
30.0%

Most occurring characters

ValueCountFrequency (%)
0700
70.0%
1300
30.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number1000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0700
70.0%
1300
30.0%

Most occurring scripts

ValueCountFrequency (%)
Common1000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0700
70.0%
1300
30.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0700
70.0%
1300
30.0%

Interactions

Correlations

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, so it is better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 no monotonic correlation, and +1 total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
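
The calculation described above can be sketched in plain Python: rank both variables (ties share their average rank), then apply the Pearson formula to the ranks. The helper names and the toy data are illustrative assumptions, not the report tool's own implementation.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared_rank
        start = end + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Toy data: a monotonic but nonlinear relationship.
months = [6, 12, 18, 24, 36, 48]
cubed = [m ** 3 for m in months]
rho_monotonic = spearman_rho(months, cubed)
r_linear_only = pearson_r(months, cubed)
print(rho_monotonic, r_linear_only)
```

On the toy data ρ reaches its maximum because the ordering is preserved exactly, while r stays below it because the relationship is not linear.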

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 no linear correlation, and +1 total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the angle of the line to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
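
A minimal sketch of this formula in plain Python follows. The input values are Months_loan_taken_for and Principal_loan_amount from the first ten sample rows of this report, so the result describes those ten rows only, not the full dataset; the function name is an illustrative choice.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r: covariance of X and Y over the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# First ten sample rows of the report.
months = [6, 48, 12, 42, 24, 36, 24, 36, 12, 30]
amount = [1169000, 5951000, 2096000, 7882000, 4870000,
          9055000, 2835000, 6948000, 3059000, 5234000]

r_sample = pearson_r(months, amount)
# Changing location and (positive) scale of one variable leaves r unchanged.
r_rescaled = pearson_r([m / 12 + 1 for m in months], amount)
print(r_sample, r_rescaled)
```

Note that the function divides by zero if either variable is constant, which is consistent with r being undefined in that case.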

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 no correlation, and +1 total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
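
The pair-counting definition above corresponds to the variant usually called τ-a. A small sketch, assuming tied pairs count as neither concordant nor discordant (the report tool may use a tie-adjusted variant instead):

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it contributes to neither count
    n = len(xs)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs

tau_same = kendall_tau_a([1, 2, 3, 4], [10, 20, 30, 40])
tau_reversed = kendall_tau_a([1, 2, 3, 4], [40, 30, 20, 10])
tau_mixed = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(tau_same, tau_reversed, tau_mixed)
```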

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
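
For orientation, here is a sketch of a bias-corrected Cramér's V along the lines of Bergsma's correction, written in plain Python. It is an assumption that the report computes the coefficient exactly this way; the function name and toy data are illustrative.

```python
from collections import Counter
from math import sqrt

def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equal-length sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    r, k = len(row_totals), len(col_totals)
    # Bias corrections for phi^2 and for the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0  # degenerate table, e.g. a constant variable
    return sqrt(phi2_corr / denom)

# Toy data: one perfectly associated pair and one independent pair.
labels = ["a", "b"] * 50
v_perfect = cramers_v_corrected(labels, labels)
v_independent = cramers_v_corrected(["a", "a", "b", "b"] * 25, ["x", "y", "x", "y"] * 25)
print(v_perfect, v_independent)
```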

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram lets you correlate variable completion more fully, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
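
Nullity correlation between two columns can be sketched as the phi coefficient of their missing/present indicators, which equals Pearson's r on 0/1 flags. The example uses the two bank-balance bucket columns from the first ten sample rows, with `None` standing for NaN; the function name is hypothetical, and the plotting library's own computation is assumed rather than reproduced.

```python
from math import sqrt

def nullity_phi(col_a, col_b):
    """Phi coefficient between the missingness indicators of two columns."""
    miss_a = [v is None for v in col_a]
    miss_b = [v is None for v in col_b]
    n11 = sum(a and b for a, b in zip(miss_a, miss_b))          # both missing
    n10 = sum(a and not b for a, b in zip(miss_a, miss_b))      # only A missing
    n01 = sum(b and not a for a, b in zip(miss_a, miss_b))      # only B missing
    n00 = sum(not a and not b for a, b in zip(miss_a, miss_b))  # both present
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        return float("nan")  # a column that is never (or always) missing
    return (n11 * n00 - n10 * n01) / denom

# Lower and upper bucket limits from the first ten sample rows.
lower = [None, "0", None, None, None, None, None, "0", None, "0"]
upper = ["0", "2 lac", None, "0", "0", None, None, "2 lac", None, "2 lac"]

phi = nullity_phi(lower, upper)
print(phi)
```

A value near +1 means the two columns tend to be missing together; a value near -1 means one tends to be present when the other is missing.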

Sample

First rows

(index) | applicant_id | Primary_applicant_age_in_years | Gender | Marital_status | Number_of_dependents | Housing | Years_at_current_residence | Employment_status | Has_been_employed_for_at_least | Has_been_employed_for_at_most | Telephone | Foreign_worker | Savings_account_balance | Balance_in_existing_bank_account_(lower_limit_of_bucket) | Balance_in_existing_bank_account_(upper_limit_of_bucket) | loan_application_id | Months_loan_taken_for | Purpose | Principal_loan_amount | EMI_rate_in_percentage_of_disposable_income | Property | Has_coapplicant | Has_guarantor | Other_EMI_plans | Number_of_existing_loans_at_this_bank | Loan_history | high_risk_applicant
0 | 1469590 | 67 | male | single | 1 | own | 4 | skilled employee / official | 7 years | NaN | Registered under the applicant's name | 1 | NaN | NaN | 0 | d68d975e-edad-11ea-8761-1d6f9c1ff461 | 6 | electronic equipment | 1169000 | 4 | real estate | 0 | 0 | NaN | 2 | critical/pending loans at other banks | 0
1 | 1203873 | 22 | female | divorced/separated/married | 1 | own | 2 | skilled employee / official | 1 year | 4 years | NaN | 1 | Low | 0 | 2 lac | d68d989e-edad-11ea-b1d5-2bcf65006448 | 48 | electronic equipment | 5951000 | 2 | real estate | 0 | 0 | NaN | 1 | existing loans paid back duly till now | 1
2 | 1432761 | 49 | male | single | 2 | own | 3 | unskilled - resident | 4 years | 7 years | NaN | 1 | Low | NaN | NaN | d68d995c-edad-11ea-814a-1b6716782575 | 12 | education | 2096000 | 2 | real estate | 0 | 0 | NaN | 1 | critical/pending loans at other banks | 0
3 | 1207582 | 45 | male | single | 2 | for free | 4 | skilled employee / official | 4 years | 7 years | NaN | 1 | Low | NaN | 0 | d68d99fc-edad-11ea-8841-17e8848060ae | 42 | FF&E | 7882000 | 2 | building society savings agreement/life insurance | 0 | 1 | NaN | 1 | existing loans paid back duly till now | 0
4 | 1674436 | 53 | male | single | 2 | for free | 4 | skilled employee / official | 1 year | 4 years | NaN | 1 | Low | NaN | 0 | d68d9a92-edad-11ea-9f3d-1f8682db006a | 24 | new vehicle | 4870000 | 3 | NaN | 0 | 0 | NaN | 2 | delay in paying off loans in the past | 1
5 | 1213971 | 35 | male | single | 2 | for free | 4 | unskilled - resident | 1 year | 4 years | Registered under the applicant's name | 1 | NaN | NaN | NaN | d68d9b1e-edad-11ea-8b43-2b6a0308d487 | 36 | education | 9055000 | 2 | NaN | 0 | 0 | NaN | 1 | existing loans paid back duly till now | 0
6 | 1428822 | 53 | male | single | 1 | own | 4 | skilled employee / official | 7 years | NaN | NaN | 1 | High | NaN | NaN | d68d9bb4-edad-11ea-bb16-0490ef14f12e | 24 | FF&E | 2835000 | 3 | building society savings agreement/life insurance | 0 | 0 | NaN | 1 | existing loans paid back duly till now | 0
7 | 1705739 | 35 | male | single | 1 | rent | 2 | management / self-employed / highly qualified employee / officer | 1 year | 4 years | Registered under the applicant's name | 1 | Low | 0 | 2 lac | d68d9c40-edad-11ea-b46c-5067ccf3672a | 36 | used vehicle | 6948000 | 2 | car or other | 0 | 0 | NaN | 1 | existing loans paid back duly till now | 0
8 | 1715169 | 61 | male | divorced/separated | 1 | own | 4 | unskilled - resident | 4 years | 7 years | NaN | 1 | Very high | NaN | NaN | d68d9cc2-edad-11ea-95a3-19eea692401f | 12 | electronic equipment | 3059000 | 2 | real estate | 0 | 0 | NaN | 1 | existing loans paid back duly till now | 0
9 | 1722991 | 28 | male | married/widowed | 1 | own | 2 | management / self-employed / highly qualified employee / officer | NaN | 0 year | NaN | 1 | Low | 0 | 2 lac | d68d9d4e-edad-11ea-99f2-2c0022cf7ade | 30 | new vehicle | 5234000 | 4 | car or other | 0 | 0 | NaN | 2 | critical/pending loans at other banks | 1

Last rows

(index) | applicant_id | Primary_applicant_age_in_years | Gender | Marital_status | Number_of_dependents | Housing | Years_at_current_residence | Employment_status | Has_been_employed_for_at_least | Has_been_employed_for_at_most | Telephone | Foreign_worker | Savings_account_balance | Balance_in_existing_bank_account_(lower_limit_of_bucket) | Balance_in_existing_bank_account_(upper_limit_of_bucket) | loan_application_id | Months_loan_taken_for | Purpose | Principal_loan_amount | EMI_rate_in_percentage_of_disposable_income | Property | Has_coapplicant | Has_guarantor | Other_EMI_plans | Number_of_existing_loans_at_this_bank | Loan_history | high_risk_applicant
990 | 1354034 | 37 | male | single | 2 | own | 1 | unskilled - resident | 0 year | 1 year | NaN | 1 | NaN | NaN | NaN | d68fb912-edad-11ea-a2a8-40ec8a427ec2 | 12 | education | 3565000 | 2 | building society savings agreement/life insurance | 0 | 0 | NaN | 2 | critical/pending loans at other banks | 0
991 | 1365267 | 34 | male | single | 2 | own | 4 | unskilled - resident | 7 years | NaN | NaN | 1 | Medium | NaN | NaN | d68fb99e-edad-11ea-b9a9-15d10df9edbb | 15 | electronic equipment | 1569000 | 4 | car or other | 0 | 0 | bank | 1 | all loans at this bank paid back duly | 0
992 | 1237705 | 23 | male | married/widowed | 1 | rent | 4 | unskilled - resident | 4 years | 7 years | NaN | 1 | NaN | NaN | 0 | d68fba20-edad-11ea-bb04-189bfb8e51dd | 18 | electronic equipment | 1936000 | 2 | car or other | 0 | 0 | NaN | 2 | existing loans paid back duly till now | 0
993 | 1609685 | 30 | male | single | 1 | own | 3 | management / self-employed / highly qualified employee / officer | NaN | 0 year | Registered under the applicant's name | 1 | Low | NaN | 0 | d68fbaa2-edad-11ea-b849-58397a50baa9 | 36 | FF&E | 3959000 | 4 | building society savings agreement/life insurance | 0 | 0 | NaN | 1 | existing loans paid back duly till now | 0
994 | 1615010 | 50 | male | single | 1 | own | 3 | skilled employee / official | 7 years | NaN | Registered under the applicant's name | 1 | NaN | NaN | NaN | d68fbb24-edad-11ea-8782-4ba4a1f08b11 | 12 | new vehicle | 2390000 | 4 | car or other | 0 | 0 | NaN | 1 | existing loans paid back duly till now | 0
995 | 1880194 | 31 | female | divorced/separated/married | 1 | own | 4 | unskilled - resident | 4 years | 7 years | NaN | 1 | Low | NaN | NaN | d68fbba6-edad-11ea-80fe-30b2f9300e3d | 12 | FF&E | 1736000 | 3 | real estate | 0 | 0 | NaN | 1 | existing loans paid back duly till now | 0
996 | 1114064 | 40 | male | divorced/separated | 1 | own | 4 | management / self-employed / highly qualified employee / officer | 1 year | 4 years | Registered under the applicant's name | 1 | Low | NaN | 0 | d68fbc28-edad-11ea-bc62-4240ac0824fa | 30 | used vehicle | 3857000 | 4 | building society savings agreement/life insurance | 0 | 0 | NaN | 1 | existing loans paid back duly till now | 0
997 | 1758046 | 38 | male | single | 1 | own | 4 | skilled employee / official | 7 years | NaN | NaN | 1 | Low | NaN | NaN | d68fbcaa-edad-11ea-aafc-2de1139e42cd | 12 | electronic equipment | 804000 | 4 | car or other | 0 | 0 | NaN | 1 | existing loans paid back duly till now | 0
998 | 1824545 | 23 | male | single | 1 | for free | 4 | skilled employee / official | 1 year | 4 years | Registered under the applicant's name | 1 | Low | NaN | 0 | d68fbd2c-edad-11ea-b49e-2894666f2df6 | 45 | electronic equipment | 1845000 | 4 | NaN | 0 | 0 | NaN | 1 | existing loans paid back duly till now | 1
999 | 1660770 | 27 | male | single | 1 | own | 4 | skilled employee / official | NaN | 0 year | NaN | 1 | Medium | 0 | 2 lac | d68fbdae-edad-11ea-a2ea-1c661d77d225 | 45 | used vehicle | 4576000 | 3 | car or other | 0 | 0 | NaN | 1 | critical/pending loans at other banks | 0